104 research outputs found

    Optimising Network Architectures for Provable Adversarial Robustness


    Partial Index Tracking: A Meta-Learning Approach

    Partial index tracking aims to cost-effectively replicate the performance of a benchmark index using a small number of assets. It is usually formulated as a regression problem, but solving it subject to real-world constraints is non-trivial. For example, the common L1-regularised model for sparse regression (i.e., LASSO) is not compatible with those constraints. In this work, we meta-learn a sparse asset selection and weighting strategy that subsequently enables effective partial index tracking by quadratic programming. In particular, we adopt an element-wise L1 norm for sparse regularisation and meta-learn the weight for each L1 term. Rather than meta-learning a fixed set of hyper-parameters, we meta-learn an inductive predictor for them based on market history, which allows generalisation over time and even across markets. Experiments are conducted on four indices from different countries, and the empirical results demonstrate the superiority of our method over baseline methods. The code is released at https://github.com/qmfin/MetaIndexTracker.
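
    To make the two-stage recipe concrete, here is a minimal NumPy/SciPy sketch, not the released MetaIndexTracker code: the per-asset penalty vector `lam` stands in for the meta-learned L1 weights (the paper predicts these from market history; here they are fixed by hand), and the long-only, fully-invested refit constraints are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def weighted_lasso_ista(X, y, lam, n_iter=2000):
    """Element-wise weighted LASSO via ISTA:
    min_w 0.5*||y - X w||^2 + sum_i lam[i] * |w[i]|.
    lam plays the role of the meta-learned per-asset L1 weights."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y))   # gradient step on the squared loss
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return w

def refit_qp(X, y, support):
    """Re-optimise the selected assets under long-only, fully-invested
    constraints (an assumed, typical form of the tracking QP)."""
    Xs = X[:, support]
    k = len(support)
    res = minimize(lambda v: 0.5 * np.sum((y - Xs @ v) ** 2),
                   np.full(k, 1.0 / k),
                   bounds=[(0.0, None)] * k,
                   constraints=({'type': 'eq', 'fun': lambda v: v.sum() - 1.0},),
                   method='SLSQP')
    w = np.zeros(X.shape[1])
    w[support] = res.x
    return w

# Toy data: 20 assets; the index is an equal-weight basket of the first 5.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.01, size=(250, 20))   # daily asset returns
y = X[:, :5].mean(axis=1)                   # benchmark index returns
lam = np.full(20, 1e-4)                     # stand-in for meta-learned weights
support = np.flatnonzero(np.abs(weighted_lasso_ista(X, y, lam)) > 1e-6)
w = refit_qp(X, y, support)
print(f"selected {support.size} assets, tracking error "
      f"{np.sqrt(np.mean((y - X @ w) ** 2)):.2e}")
```

    The split matters because the soft-thresholding step is only used for asset selection; the final tradable weights come from the constrained refit, which is where the real-world constraints that break plain LASSO are enforced.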

    Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation

    In reinforcement learning (RL), domain randomisation is an increasingly popular technique for learning more general policies that are robust to domain shifts at deployment. However, naively aggregating information from randomised domains may lead to high variance in gradient estimation and an unstable learning process. To address this issue, we present a peer-to-peer online distillation strategy for RL, termed P2PDRL, in which multiple workers are each assigned to a different environment and exchange knowledge through mutual regularisation based on the Kullback-Leibler divergence. Our experiments on continuous control tasks show that P2PDRL enables robust learning across a wider randomisation distribution than baselines, and more robust generalisation to new environments at test time.
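
    The key ingredient is the mutual KL regulariser that couples the workers. Below is a minimal PyTorch sketch of what such a term could look like, not the authors' implementation: the diagonal-Gaussian policy, the direction of the KL, the detached peer targets, and the weighting coefficient `beta` mentioned in the comment are all assumptions for illustration.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence

class GaussianPolicy(nn.Module):
    """Small diagonal-Gaussian policy for continuous control."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                nn.Linear(hidden, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        return Normal(self.mu(obs), self.log_std.exp())

def peer_kl_regulariser(policies, k, obs):
    """Mutual distillation term for worker k on observations from its own
    randomised environment: average KL(pi_k || pi_j) over all peers j != k.
    Peers are treated as fixed targets here (detached), which is an assumption."""
    d_k = policies[k].dist(obs)
    terms = []
    for j, peer in enumerate(policies):
        if j == k:
            continue
        with torch.no_grad():                # no gradient through the peer
            d_j = peer.dist(obs)
        terms.append(kl_divergence(d_k, d_j).sum(-1).mean())
    return torch.stack(terms).mean()

# Toy usage: 4 workers; each would add beta * this term to its usual RL loss.
policies = [GaussianPolicy(obs_dim=8, act_dim=2) for _ in range(4)]
obs_batch = torch.randn(32, 8)               # batch from worker 0's domain
print(peer_kl_regulariser(policies, k=0, obs=obs_batch))
```

    Because each worker computes the regulariser on states from its own randomised domain, knowledge flows between peers without any worker ever having to train on another's high-variance domain directly.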